Overview

Dataset statistics

Number of variables: 20
Number of observations: 4410
Missing cells: 28
Missing cells (%): < 0.1%
Duplicate rows: 2912
Duplicate rows (%): 66.0%
Total size in memory: 689.2 KiB
Average record size in memory: 160.0 B
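
The percentages and the average record size above follow from the raw counts. The short check below recomputes them as a sketch; the variable names are illustrative, and the formulas (cells = variables × observations, KiB = 1024 bytes) are assumptions about how the report derives these figures rather than something the report states.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 20
n_observations = 4410
missing_cells = 28
duplicate_rows = 2912
total_size_kib = 689.2

# Assumed derivations (not stated in the report itself).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.3f}%")               # 0.032%, shown as "< 0.1%"
print(f"duplicate rows: {duplicate_pct:.1f}%")            # 66.0%
print(f"average record size: {avg_record_bytes:.1f} B")   # 160.0 B
```

The 28 missing cells also match the per-variable counts reported further down: 19 in NumCompaniesWorked plus 9 in TotalWorkingYears.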

Variable types

NUM: 12
CAT: 6
BOOL: 2

Reproduction

Analysis started: 2020-07-23 15:31:15.401606
Analysis finished: 2020-07-23 15:32:03.634493
Duration: 48.23 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 2912 (66.0%) duplicate rows [Duplicates]
NumCompaniesWorked has 586 (13.3%) zeros [Zeros]
TrainingTimesLastYear has 162 (3.7%) zeros [Zeros]
YearsAtCompany has 132 (3.0%) zeros [Zeros]
YearsSinceLastPromotion has 1743 (39.5%) zeros [Zeros]
YearsWithCurrManager has 789 (17.9%) zeros [Zeros]

Variables

Age
Real number (ℝ≥0)

Distinct count: 43
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.923809523809524
Minimum: 18
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 18
5-th percentile: 24
Q1: 30
median: 36
Q3: 43
95-th percentile: 54
Maximum: 60
Range: 42
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.133301271
Coefficient of variation (CV): 0.2473553349
Kurtosis: -0.4059505398
Mean: 36.92380952
Median Absolute Deviation (MAD): 6
Skewness: 0.4130049527
Sum: 162834
Variance: 83.41719211

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
35                 234     5.3%
34                 231     5.2%
36                 207     4.7%
31                 207     4.7%
29                 204     4.6%
32                 183     4.1%
30                 180     4.1%
38                 174     3.9%
33                 174     3.9%
40                 171     3.9%
Other values (33)  2445    55.4%

Value              Count   Frequency (%)
18                 24      0.5%
19                 27      0.6%
20                 33      0.7%
21                 39      0.9%
22                 48      1.1%

Value              Count   Frequency (%)
60                 15      0.3%
59                 30      0.7%
58                 42      1.0%
57                 12      0.3%
56                 42      1.0%
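
Several of the Age statistics are simple functions of one another. The sketch below recomputes them from the reported values as a consistency check. The variable names are illustrative, and the bin-width line assumes the ten histogram bins evenly span minimum to maximum, which the report does not state explicitly.

```python
import math

# Values copied from the Age section of this report.
mean = 36.92380952
std = 9.133301271
q1, q3 = 30, 43
minimum, maximum = 18, 60
n_observations = 4410
total = 162834

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance is the squared standard deviation
value_range = maximum - minimum      # range
iqr = q3 - q1                        # interquartile range
mean_from_sum = total / n_observations

# Assumption: 10 equal-width bins spanning [minimum, maximum].
bin_width = value_range / 10

print(cv, variance, value_range, iqr, mean_from_sum, bin_width)
```

The recomputed values agree with the reported CV (0.2473553349), variance (83.41719211), range (42), and IQR (13) up to rounding.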
 

Attrition
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value              Count   Frequency (%)
0                  3699    83.9%
1                  711     16.1%
 

BusinessTravel
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value              Count   Frequency (%)
Travel_Rarely      3129    71.0%
Travel_Frequently  831     18.8%
Non-Travel         450     10.2%

Length

Max length: 17
Median length: 13
Mean length: 13.44761905
Min length: 10

Department
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value                   Count   Frequency (%)
Research & Development  2883    65.4%
Sales                   1338    30.3%
Human Resources         189     4.3%

Length

Max length: 22
Median length: 22
Mean length: 16.54217687
Min length: 5

DistanceFromHome
Real number (ℝ≥0)

Distinct count: 29
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.19251700680272
Minimum: 1
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 7
Q3: 14
95-th percentile: 26
Maximum: 29
Range: 28
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 8.105025519
Coefficient of variation (CV): 0.8816981805
Kurtosis: -0.2270453549
Mean: 9.192517007
Median Absolute Deviation (MAD): 5
Skewness: 0.9574657464
Sum: 40539
Variance: 65.69143866

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
2                  633     14.4%
1                  624     14.1%
10                 258     5.9%
9                  255     5.8%
7                  252     5.7%
3                  252     5.7%
8                  240     5.4%
5                  195     4.4%
4                  192     4.4%
6                  177     4.0%
Other values (19)  1332    30.2%

Value              Count   Frequency (%)
1                  624     14.1%
2                  633     14.4%
3                  252     5.7%
4                  192     4.4%
5                  195     4.4%

Value              Count   Frequency (%)
29                 81      1.8%
28                 69      1.6%
27                 36      0.8%
26                 75      1.7%
25                 75      1.7%
 

Education
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.912925170068027
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 4
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.023932629
Coefficient of variation (CV): 0.3515135367
Kurtosis: -0.5605690113
Mean: 2.91292517
Median Absolute Deviation (MAD): 1
Skewness: -0.2894838784
Sum: 12846
Variance: 1.048438028

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
3                  1716    38.9%
4                  1194    27.1%
2                  846     19.2%
1                  510     11.6%
5                  144     3.3%

Value              Count   Frequency (%)
1                  510     11.6%
2                  846     19.2%
3                  1716    38.9%
4                  1194    27.1%
5                  144     3.3%

Value              Count   Frequency (%)
5                  144     3.3%
4                  1194    27.1%
3                  1716    38.9%
2                  846     19.2%
1                  510     11.6%
 

EducationField
Categorical

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value              Count   Frequency (%)
Life Sciences      1818    41.2%
Medical            1392    31.6%
Marketing          477     10.8%
Technical Degree   396     9.0%
Other              246     5.6%
Human Resources    81      1.8%

Length

Max length: 16
Median length: 13
Mean length: 10.53333333
Min length: 5

Gender
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value              Count   Frequency (%)
1                  2646    60.0%
0                  1764    40.0%
 

JobLevel
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0639455782312925
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 4
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.106688807
Coefficient of variation (CV): 0.5362005755
Kurtosis: 0.3955253647
Mean: 2.063945578
Median Absolute Deviation (MAD): 1
Skewness: 1.02470323
Sum: 9102
Variance: 1.224760115

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
1                  1629    36.9%
2                  1602    36.3%
3                  654     14.8%
4                  318     7.2%
5                  207     4.7%

Value              Count   Frequency (%)
1                  1629    36.9%
2                  1602    36.3%
3                  654     14.8%
4                  318     7.2%
5                  207     4.7%

Value              Count   Frequency (%)
5                  207     4.7%
4                  318     7.2%
3                  654     14.8%
2                  1602    36.3%
1                  1629    36.9%
 

JobRole
Categorical

Distinct count: 9
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value                      Count   Frequency (%)
Sales Executive            978     22.2%
Research Scientist         876     19.9%
Laboratory Technician      777     17.6%
Manufacturing Director     435     9.9%
Healthcare Representative  393     8.9%
Manager                    306     6.9%
Sales Representative       249     5.6%
Research Director          240     5.4%
Human Resources            156     3.5%

Length

Max length: 25
Median length: 18
Mean length: 18.0707483
Min length: 7

MaritalStatus
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value              Count   Frequency (%)
Married            2019    45.8%
Single             1410    32.0%
Divorced           981     22.2%

Length

Max length: 8
Median length: 7
Mean length: 6.902721088
Min length: 6

MonthlyIncome
Real number (ℝ≥0)

Distinct count: 1349
Unique (%): 30.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65029.31292517007
Minimum: 10090
Maximum: 199990
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 10090
5-th percentile: 20970
Q1: 29110
median: 49190
Q3: 83800
95-th percentile: 178560
Maximum: 199990
Range: 189900
Interquartile range (IQR): 54690

Descriptive statistics

Standard deviation: 47068.88856
Coefficient of variation (CV): 0.7238103317
Kurtosis: 1.000231855
Mean: 65029.31293
Median Absolute Deviation (MAD): 21990
Skewness: 1.368884163
Sum: 286779270
Variance: 2215480270

Histogram with fixed size bins (bins=10)

Value                Count   Frequency (%)
23420                12      0.3%
61420                9       0.2%
27410                9       0.2%
24040                9       0.2%
26100                9       0.2%
23800                9       0.2%
55620                9       0.2%
34520                9       0.2%
63470                9       0.2%
25590                9       0.2%
Other values (1339)  4317    97.9%

Value                Count   Frequency (%)
10090                3       0.1%
10510                3       0.1%
10520                3       0.1%
10810                3       0.1%
10910                3       0.1%

Value                Count   Frequency (%)
199990               3       0.1%
199730               3       0.1%
199430               3       0.1%
199260               3       0.1%
198590               3       0.1%
 

NumCompaniesWorked
Real number (ℝ≥0)

ZEROS

Distinct count: 10
Unique (%): 0.2%
Missing: 19
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6948303347756775
Minimum: 0.0
Maximum: 9.0
Zeros: 586
Zeros (%): 13.3%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 4
95-th percentile: 8
Maximum: 9
Range: 9
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.498886889
Coefficient of variation (CV): 0.9272891345
Kurtosis: 0.007287480878
Mean: 2.694830335
Median Absolute Deviation (MAD): 1
Skewness: 1.026766676
Sum: 11833
Variance: 6.244435683

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
1                  1558    35.3%
0                  586     13.3%
3                  474     10.7%
2                  438     9.9%
4                  415     9.4%
7                  222     5.0%
6                  208     4.7%
5                  187     4.2%
9                  156     3.5%
8                  147     3.3%
(Missing)          19      0.4%

Value              Count   Frequency (%)
0                  586     13.3%
1                  1558    35.3%
2                  438     9.9%
3                  474     10.7%
4                  415     9.4%

Value              Count   Frequency (%)
9                  156     3.5%
8                  147     3.3%
7                  222     5.0%
6                  208     4.7%
5                  187     4.2%
 

PercentSalaryHike
Real number (ℝ≥0)

Distinct count: 15
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.209523809523809
Minimum: 11
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 11
5-th percentile: 11
Q1: 12
median: 14
Q3: 18
95-th percentile: 22
Maximum: 25
Range: 14
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.659107516
Coefficient of variation (CV): 0.2405800183
Kurtosis: -0.3026383931
Mean: 15.20952381
Median Absolute Deviation (MAD): 2
Skewness: 0.8205689838
Sum: 67074
Variance: 13.38906782

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
11                 630     14.3%
13                 627     14.2%
14                 603     13.7%
12                 594     13.5%
15                 303     6.9%
18                 267     6.1%
17                 246     5.6%
16                 234     5.3%
19                 228     5.2%
22                 168     3.8%
Other values (5)   510     11.6%

Value              Count   Frequency (%)
11                 630     14.3%
12                 594     13.5%
13                 627     14.2%
14                 603     13.7%
15                 303     6.9%

Value              Count   Frequency (%)
25                 54      1.2%
24                 63      1.4%
23                 84      1.9%
22                 168     3.8%
21                 144     3.3%
 

StockOptionLevel
Categorical

Distinct count: 4
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.5 KiB

Value              Count   Frequency (%)
0                  1893    42.9%
1                  1788    40.5%
2                  474     10.7%
3                  255     5.8%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

TotalWorkingYears
Real number (ℝ≥0)

Distinct count: 40
Unique (%): 0.9%
Missing: 9
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.279936378095888
Minimum: 0.0
Maximum: 40.0
Zeros: 33
Zeros (%): 0.7%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 10
Q3: 15
95-th percentile: 28
Maximum: 40
Range: 40
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 7.782222141
Coefficient of variation (CV): 0.6899172017
Kurtosis: 0.9129359961
Mean: 11.27993638
Median Absolute Deviation (MAD): 4
Skewness: 1.116831796
Sum: 49643
Variance: 60.56298145

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
10                 605     13.7%
6                  375     8.5%
8                  307     7.0%
9                  287     6.5%
5                  264     6.0%
7                  243     5.5%
1                  242     5.5%
4                  189     4.3%
12                 144     3.3%
3                  126     2.9%
Other values (30)  1619    36.7%

Value              Count   Frequency (%)
0                  33      0.7%
1                  242     5.5%
2                  93      2.1%
3                  126     2.9%
4                  189     4.3%

Value              Count   Frequency (%)
40                 6       0.1%
38                 3       0.1%
37                 12      0.3%
36                 18      0.4%
35                 9       0.2%
 

TrainingTimesLastYear
Real number (ℝ≥0)

ZEROS

Distinct count: 7
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7993197278911564
Minimum: 0
Maximum: 6
Zeros: 162
Zeros (%): 3.7%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 3
Q3: 3
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.28897817
Coefficient of variation (CV): 0.4604612174
Kurtosis: 0.4911489985
Mean: 2.799319728
Median Absolute Deviation (MAD): 1
Skewness: 0.5527476257
Sum: 12345
Variance: 1.661464722

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
2                  1641    37.2%
3                  1473    33.4%
4                  369     8.4%
5                  357     8.1%
1                  213     4.8%
6                  195     4.4%
0                  162     3.7%

Value              Count   Frequency (%)
0                  162     3.7%
1                  213     4.8%
2                  1641    37.2%
3                  1473    33.4%
4                  369     8.4%

Value              Count   Frequency (%)
6                  195     4.4%
5                  357     8.1%
4                  369     8.4%
3                  1473    33.4%
2                  1641    37.2%
 

YearsAtCompany
Real number (ℝ≥0)

ZEROS

Distinct count: 37
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0081632653061225
Minimum: 0
Maximum: 40
Zeros: 132
Zeros (%): 3.0%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 5
Q3: 9
95-th percentile: 20
Maximum: 40
Range: 40
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 6.125135445
Coefficient of variation (CV): 0.8740001072
Kurtosis: 3.923864205
Mean: 7.008163265
Median Absolute Deviation (MAD): 3
Skewness: 1.763328232
Sum: 30906
Variance: 37.51728422

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
5                  588     13.3%
1                  513     11.6%
3                  384     8.7%
2                  381     8.6%
10                 360     8.2%
4                  330     7.5%
7                  270     6.1%
9                  246     5.6%
8                  240     5.4%
6                  228     5.2%
Other values (27)  870     19.7%

Value              Count   Frequency (%)
0                  132     3.0%
1                  513     11.6%
2                  381     8.6%
3                  384     8.7%
4                  330     7.5%

Value              Count   Frequency (%)
40                 3       0.1%
37                 3       0.1%
36                 6       0.1%
34                 3       0.1%
33                 15      0.3%
 

YearsSinceLastPromotion
Real number (ℝ≥0)

ZEROS

Distinct count: 16
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1877551020408164
Minimum: 0
Maximum: 15
Zeros: 1743
Zeros (%): 39.5%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 3
95-th percentile: 9
Maximum: 15
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.221699321
Coefficient of variation (CV): 1.4726051
Kurtosis: 3.601760518
Mean: 2.187755102
Median Absolute Deviation (MAD): 1
Skewness: 1.982939156
Sum: 9648
Variance: 10.37934651

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
0                  1743    39.5%
1                  1071    24.3%
2                  477     10.8%
7                  228     5.2%
4                  183     4.1%
3                  156     3.5%
5                  135     3.1%
6                  96      2.2%
11                 72      1.6%
8                  54      1.2%
Other values (6)   195     4.4%

Value              Count   Frequency (%)
0                  1743    39.5%
1                  1071    24.3%
2                  477     10.8%
3                  156     3.5%
4                  183     4.1%

Value              Count   Frequency (%)
15                 39      0.9%
14                 27      0.6%
13                 30      0.7%
12                 30      0.7%
11                 72      1.6%
 

YearsWithCurrManager
Real number (ℝ≥0)

ZEROS

Distinct count: 18
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.12312925170068
Minimum: 0
Maximum: 17
Zeros: 789
Zeros (%): 17.9%
Memory size: 34.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 3
Q3: 7
95-th percentile: 10
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.567326744
Coefficient of variation (CV): 0.8651988638
Kurtosis: 0.1679485428
Mean: 4.123129252
Median Absolute Deviation (MAD): 3
Skewness: 0.8328836111
Sum: 18183
Variance: 12.7258201

Histogram with fixed size bins (bins=10)

Value              Count   Frequency (%)
2                  1032    23.4%
0                  789     17.9%
7                  648     14.7%
3                  426     9.7%
8                  321     7.3%
4                  294     6.7%
1                  228     5.2%
9                  192     4.4%
5                  93      2.1%
6                  87      2.0%
Other values (8)   300     6.8%

Value              Count   Frequency (%)
0                  789     17.9%
1                  228     5.2%
2                  1032    23.4%
3                  426     9.7%
4                  294     6.7%

Value              Count   Frequency (%)
17                 21      0.5%
16                 6       0.1%
15                 15      0.3%
14                 15      0.3%
13                 42      1.0%
 

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, where -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
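
The definition above can be sketched in a few lines of standard-library Python. The function name `pearson_r` and the sample numbers are illustrative and not part of the report. The 1/n factors of the covariance and the standard deviations cancel, so the code works with plain sums.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares; the common 1/n factor cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up example data (not taken from the dataset).
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ys = [2.0, 3.0, 3.5, 8.0, 9.0]
print(pearson_r(xs, ys))
# Shifting and positively rescaling one variable leaves r unchanged.
print(pearson_r(xs, [3 * y + 7 for y in ys]))
```

A constant input has zero spread and would raise a division-by-zero error in this sketch.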

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, where -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
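
As a minimal sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names `average_ranks` and `spearman_rho` are illustrative. Giving tied values the average of their positions is a common convention and is assumed here; the report does not spell out its tie handling.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (spread_x * spread_y)

# A nonlinear but strictly increasing relationship has rho = 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```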

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, where -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
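
The pair-counting definition can be sketched directly. This is the simple form often called tau-a, with the illustrative name `kendall_tau_a`. Tie-adjusted variants such as tau-b give different values when ties are present, and the report does not say which variant it plots, so treat this as an illustration of the sentence above rather than of the report's exact computation.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    pairs = list(combinations(zip(xs, ys), 2))
    if not pairs:
        raise ValueError("need at least 2 observations")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant, 3 pairs
```

The loop visits every pair, so the cost grows quadratically with the number of observations.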

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Further details are available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
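
A minimal sketch of the bias-corrected estimator, computed from a contingency table of counts. The function name `cramers_v_corrected` and the example tables are illustrative. The code assumes every row and column total is positive and that the table has at least two rows and two columns. It follows the correction as commonly stated for Bergsma's proposal and is not taken from the report's own source code.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

print(cramers_v_corrected([[10, 20], [20, 40]]))  # proportional rows: independence
print(cramers_v_corrected([[50, 0], [0, 50]]))    # each row determines the column
```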

Missing values


Sample

First rows

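index | Age | Attrition | BusinessTravel | Department | DistanceFromHome | Education | EducationField | Gender | JobLevel | JobRole | MaritalStatus | MonthlyIncome | NumCompaniesWorked | PercentSalaryHike | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | YearsAtCompany | YearsSinceLastPromotion | YearsWithCurrManager
0 | 51 | 0 | Travel_Rarely | Sales | 6 | 2 | Life Sciences | 0 | 1 | Healthcare Representative | Married | 131160 | 1.0 | 11 | 0 | 1.0 | 6 | 1 | 0 | 0
1 | 31 | 1 | Travel_Frequently | Research & Development | 10 | 1 | Life Sciences | 0 | 1 | Research Scientist | Single | 41890 | 0.0 | 23 | 1 | 6.0 | 3 | 5 | 1 | 4
2 | 32 | 0 | Travel_Frequently | Research & Development | 17 | 4 | Other | 1 | 4 | Sales Executive | Married | 193280 | 1.0 | 15 | 3 | 5.0 | 2 | 5 | 0 | 3
3 | 38 | 0 | Non-Travel | Research & Development | 2 | 5 | Life Sciences | 1 | 3 | Human Resources | Married | 83210 | 3.0 | 11 | 3 | 13.0 | 5 | 8 | 7 | 5
4 | 32 | 0 | Travel_Rarely | Research & Development | 10 | 1 | Medical | 1 | 1 | Sales Executive | Single | 23420 | 4.0 | 12 | 2 | 9.0 | 2 | 6 | 0 | 4
5 | 46 | 0 | Travel_Rarely | Research & Development | 8 | 3 | Life Sciences | 0 | 4 | Research Director | Married | 40710 | 3.0 | 13 | 0 | 28.0 | 5 | 7 | 7 | 7
6 | 28 | 1 | Travel_Rarely | Research & Development | 11 | 2 | Medical | 1 | 2 | Sales Executive | Single | 58130 | 2.0 | 20 | 1 | 5.0 | 2 | 0 | 0 | 0
7 | 29 | 0 | Travel_Rarely | Research & Development | 18 | 3 | Life Sciences | 1 | 2 | Sales Executive | Married | 31430 | 2.0 | 22 | 3 | 10.0 | 2 | 0 | 0 | 0
8 | 31 | 0 | Travel_Rarely | Research & Development | 1 | 3 | Life Sciences | 1 | 3 | Laboratory Technician | Married | 20440 | 0.0 | 21 | 0 | 10.0 | 2 | 9 | 7 | 8
9 | 25 | 0 | Non-Travel | Research & Development | 7 | 4 | Medical | 0 | 4 | Laboratory Technician | Divorced | 134640 | 1.0 | 13 | 1 | 6.0 | 2 | 6 | 1 | 5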

Last rows

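index | Age | Attrition | BusinessTravel | Department | DistanceFromHome | Education | EducationField | Gender | JobLevel | JobRole | MaritalStatus | MonthlyIncome | NumCompaniesWorked | PercentSalaryHike | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | YearsAtCompany | YearsSinceLastPromotion | YearsWithCurrManager
4400 | 37 | 0 | Travel_Rarely | Research & Development | 22 | 5 | Medical | 0 | 2 | Manufacturing Director | Married | 30550 | 2.0 | 14 | 3 | 17.0 | 3 | 3 | 0 | 2
4401 | 45 | 0 | Travel_Frequently | Sales | 21 | 1 | Marketing | 1 | 3 | Research Scientist | Married | 22890 | 4.0 | 13 | 0 | 9.0 | 3 | 3 | 0 | 2
4402 | 37 | 1 | Travel_Frequently | Sales | 2 | 3 | Marketing | 1 | 1 | Laboratory Technician | Divorced | 40010 | 6.0 | 11 | 1 | 17.0 | 2 | 1 | 0 | 0
4403 | 39 | 0 | Travel_Frequently | Research & Development | 22 | 3 | Medical | 0 | 1 | Manufacturing Director | Single | 129650 | 0.0 | 19 | 1 | 20.0 | 2 | 19 | 11 | 8
4404 | 29 | 0 | Travel_Rarely | Sales | 4 | 3 | Other | 0 | 2 | Human Resources | Single | 35390 | 1.0 | 18 | 0 | 6.0 | 2 | 6 | 1 | 5
4405 | 42 | 0 | Travel_Rarely | Research & Development | 5 | 4 | Medical | 0 | 1 | Research Scientist | Single | 60290 | 3.0 | 17 | 1 | 10.0 | 5 | 3 | 0 | 2
4406 | 29 | 0 | Travel_Rarely | Research & Development | 2 | 4 | Medical | 1 | 1 | Laboratory Technician | Divorced | 26790 | 2.0 | 15 | 0 | 10.0 | 2 | 3 | 0 | 2
4407 | 25 | 0 | Travel_Rarely | Research & Development | 25 | 2 | Life Sciences | 1 | 2 | Sales Executive | Married | 37020 | 0.0 | 20 | 0 | 5.0 | 4 | 4 | 1 | 2
4408 | 42 | 0 | Travel_Rarely | Sales | 18 | 2 | Medical | 1 | 1 | Laboratory Technician | Divorced | 23980 | 0.0 | 14 | 1 | 10.0 | 2 | 9 | 7 | 8
4409 | 40 | 0 | Travel_Rarely | Research & Development | 28 | 3 | Medical | 1 | 2 | Laboratory Technician | Divorced | 54680 | 0.0 | 12 | 0 | NaN | 6 | 21 | 3 | 9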

Duplicate rows

Most frequent

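index | Age | Attrition | BusinessTravel | Department | DistanceFromHome | Education | EducationField | Gender | JobLevel | JobRole | MaritalStatus | MonthlyIncome | NumCompaniesWorked | PercentSalaryHike | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | YearsAtCompany | YearsSinceLastPromotion | YearsWithCurrManager | count
0 | 18 | 0 | Non-Travel | Research & Development | 1 | 4 | Medical | 1 | 2 | Sales Executive | Single | 27200 | 1.0 | 22 | 1 | 0.0 | 2 | 0 | 0 | 0 | 3
1 | 18 | 0 | Non-Travel | Research & Development | 2 | 3 | Life Sciences | 1 | 3 | Sales Representative | Single | 186060 | 1.0 | 24 | 2 | 0.0 | 4 | 0 | 0 | 0 | 3
2 | 18 | 0 | Non-Travel | Sales | 5 | 4 | Other | 1 | 2 | Manager | Single | 32300 | 1.0 | 12 | 1 | 0.0 | 3 | 0 | 0 | 0 | 3
3 | 18 | 0 | Travel_Rarely | Sales | 7 | 3 | Life Sciences | 1 | 1 | Research Scientist | Single | 38120 | 1.0 | 15 | 0 | 0.0 | 3 | 0 | 0 | 0 | 3
4 | 18 | 1 | Non-Travel | Research & Development | 2 | 4 | Medical | 1 | 3 | Laboratory Technician | Single | 109650 | 1.0 | 18 | 0 | 0.0 | 5 | 0 | 0 | 0 | 3
5 | 18 | 1 | Travel_Frequently | Research & Development | 2 | 3 | Technical Degree | 1 | 1 | Sales Executive | Single | 34680 | 1.0 | 18 | 2 | 0.0 | 4 | 0 | 0 | 0 | 3
7 | 18 | 1 | Travel_Rarely | Research & Development | 1 | 4 | Life Sciences | 1 | 1 | Sales Executive | Single | 23350 | 1.0 | 14 | 2 | 0.0 | 3 | 0 | 0 | 0 | 3
8 | 19 | 0 | Travel_Rarely | Research & Development | 1 | 3 | Other | 0 | 5 | Manufacturing Director | Single | 152020 | 1.0 | 18 | 3 | 1.0 | 2 | 1 | 0 | 0 | 3
9 | 19 | 0 | Travel_Rarely | Research & Development | 23 | 4 | Life Sciences | 1 | 2 | Laboratory Technician | Single | 191970 | 1.0 | 12 | 0 | 1.0 | 2 | 1 | 0 | 0 | 3
10 | 19 | 0 | Travel_Rarely | Sales | 2 | 4 | Marketing | 1 | 3 | Laboratory Technician | Single | 115570 | 1.0 | 22 | 0 | 1.0 | 0 | 1 | 0 | 1 | 3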
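
A table like the one above can be produced by counting identical rows. The sketch below uses only the standard library; the function name `duplicate_summary` and the toy rows are illustrative. It counts every repeat beyond a row's first occurrence, which is one common convention. The report does not state which convention lies behind its figure of 2912 duplicate rows.

```python
from collections import Counter

def duplicate_summary(rows, top=10):
    """Return (repeats beyond first occurrence, most frequent duplicated rows)."""
    counts = Counter(tuple(row) for row in rows)
    # Each distinct row contributes (count - 1) repeats.
    extra_rows = sum(count - 1 for count in counts.values())
    most_frequent = [(row, count) for row, count in counts.most_common(top) if count > 1]
    return extra_rows, most_frequent

# Toy rows (not taken from the dataset).
rows = [("a", 1), ("a", 1), ("b", 2), ("a", 1), ("c", 3), ("c", 3)]
extra, frequent = duplicate_summary(rows)
print(extra)      # 3
print(frequent)   # [(('a', 1), 3), (('c', 3), 2)]
```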